Processing “Computed” Texts

Author

  • Jean-Michel Hufflen
Abstract

This article is a comparison of methods that may be used to derive texts to be typeset by a word processor. By ‘derive’, we mean that such texts are extracted from a larger structure, which can be viewed as a database. The present standard for such a structure uses an XML-like format, and we give an overview of the available tools for this derivation task.

Similar resources

Statistical Natural Language Processing Method for Variant Texts Segmentation

It is well known that some techniques have already been developed to automatically subdivide texts into multiparagraph subtopic passages, such as TextTiling methodology proposed by Hearst. However, an additional algorithm is needed to perform a similar task for parallel or variant texts, because ambiguous and complicated traces of cross citation among them might often generate some sinuous patt...

Computing semantic relatedness of words and texts in Wikipedia-derived semantic space

Adequate representation of natural language semantics requires access to vast amounts of common sense and domain-specific world knowledge. Prior work in the field was either based on purely statistical techniques that did not make use of background knowledge or on huge manual efforts, such as the CYC projects. Here we propose a novel method, called Explicit Semantic Analysis (ESA), for finegrai...

Automatic Text Decomposition and Structuring

Sophisticated text similarity measurements are used to determine relationships between natural-language texts and text segments. The resulting linked hypertext maps are used to identify different text types and text structures, leading to improved text access and utilization. Examples of text decomposition are given for expository and non-expository texts. The vector processing model of retriev...

Memory-based language processing: psycholinguistic research in the 1990s.

There are two main domains of research in psycholinguistics: sentence processing, concerned with how the syntactic structures of sentences are computed, and text processing, concerned with how the meanings of larger units of text are understood. In recent sentence processing research, a new and controversial theme is that syntactic computations may rely heavily on statistical information about ...

Accelerating Boyer Moore Searches on Binary Texts

The Boyer and Moore (BM) pattern matching algorithm is considered as one of the best, but its performance is reduced on binary data. Yet, searching in binary texts has important applications, such as compressed matching. The paper shows how, by means of some pre-computed tables, one may implement the BM algorithm also for the binary case without referring to bits, and processing only entire blo...

Publication date: 2010